Norm-preserving Orthogonal Permutation Linear Unit Activation Functions (OPLU)

Authors

  • Artem N. Chernodub
  • Dimitri Nowicki
Abstract

We propose a novel activation function that implements piecewise orthogonal non-linear mappings based on permutations. It is straightforward to implement, very computationally efficient, and has low memory requirements. We tested it on two toy problems for feedforward and recurrent networks, where it shows performance similar to tanh and ReLU. The OPLU activation function ensures norm preservation of the backpropagated gradients; it is therefore potentially well suited for training deep, extra-deep, and recurrent neural networks.
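The abstract's key idea can be illustrated with a minimal numpy sketch, assuming the common pairwise formulation of OPLU in which inputs are grouped into pairs and each pair is sorted, i.e. the unit outputs (max(a, b), min(a, b)). Sorting a pair is a data-dependent permutation, so the mapping is piecewise orthogonal and preserves the Euclidean norm. The function name `oplu` and the pairing scheme are our illustrative choices, not code from the paper:

```python
import numpy as np

def oplu(x):
    """Sketch of an OPLU-style activation: group inputs into pairs
    and sort each pair in descending order, i.e. output
    (max(a, b), min(a, b)). Each pair is either kept or swapped,
    so the map is a (data-dependent) permutation and preserves
    the vector norm."""
    x = np.asarray(x, dtype=float)
    pairs = x.reshape(-1, 2)                     # group into pairs
    out = np.stack([pairs.max(axis=1),           # larger element first
                    pairs.min(axis=1)], axis=1)  # smaller element second
    return out.reshape(x.shape)

v = np.array([3.0, -1.0, 0.5, 2.0])
y = oplu(v)
print(y)                                                   # [ 3.  -1.   2.   0.5]
print(np.allclose(np.linalg.norm(y), np.linalg.norm(v)))   # True: norm preserved
```

Because the forward map is a permutation on each linear region, the backward pass routes gradients through the same permutation, which is where the norm preservation of backpropagated gradients comes from.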


Similar papers

Orthogonality preserving mappings on inner product C*-modules

Suppose that A is a C*-algebra. We consider the class of A-linear mappings between two inner product A-modules such that for any two orthogonal vectors in the domain space their values are orthogonal in the target space. In this paper, we intend to determine A-linear mappings that preserve orthogonality. For this purpose, suppose that E and F are two inner product A-modules and A+ is the set o...


Metric entropy and sparse linear approximation of ℓq-hulls for 0 < q ≤ 1

Consider ℓq-hulls, 0 < q ≤ 1, from a dictionary of M functions in the Lp space for 1 ≤ p < ∞. Their precise metric entropy orders are derived. Sparse linear approximation bounds are obtained to characterize the number of terms needed to achieve accurate approximation of the best function in an ℓq-hull that is closest to a target function. Furthermore, in the special case of p = 2, it is shown that a ...


Approximating Orthogonal Matrices by Permutation Matrices

Motivated in part by a problem of combinatorial optimization and in part by analogies with quantum computations, we consider approximations of orthogonal matrices U by "non-commutative convex combinations" A of permutation matrices of the type A = ∑_σ A_σ σ, where σ are permutation matrices and A_σ are positive semidefinite n × n matrices summing up to the identity matrix. We prove that for every n × ...


DizzyRNN: Reparameterizing Recurrent Neural Networks for Norm-Preserving Backpropagation

The vanishing and exploding gradient problems are well-studied obstacles that make it difficult for recurrent neural networks to learn long-term time dependencies. We propose a reparameterization of standard recurrent neural networks to update linear transformations in a provably norm-preserving way through Givens rotations. Additionally, we use the absolute value function as an element-wise no...
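The norm-preserving property that DizzyRNN relies on can be checked directly: a Givens rotation is an orthogonal matrix, so applying it never changes a vector's Euclidean norm. A minimal numpy sketch (our own illustration, not the DizzyRNN implementation; the helper name `givens` is ours):

```python
import numpy as np

def givens(n, i, j, theta):
    """n x n Givens rotation acting on coordinates i and j:
    the identity matrix with a 2x2 plane rotation embedded in
    rows/columns i and j. Orthogonal by construction."""
    G = np.eye(n)
    c, s = np.cos(theta), np.sin(theta)
    G[i, i] = c
    G[j, j] = c
    G[i, j] = -s
    G[j, i] = s
    return G

rng = np.random.default_rng(0)
x = rng.standard_normal(4)
G = givens(4, 0, 2, 0.7)
print(np.allclose(G @ G.T, np.eye(4)))                     # True: G is orthogonal
print(np.allclose(np.linalg.norm(G @ x), np.linalg.norm(x)))  # True: norm preserved
```

Composing such rotations keeps the recurrent transformation orthogonal, which is what prevents gradient norms from vanishing or exploding through time.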



Journal title:
  • CoRR

Volume: abs/1604.02313  Issue:

Pages: -

Publication date: 2016